New insights into permeability determination by coupling Stoneley wave propagation and conventional petrophysical logs in carbonate oil reservoirs

The need to determine permeability at different stages of evaluation, completion, optimization of Enhanced Oil Recovery (EOR) operations, and reservoir modeling and management is reflected. Therefore, various methods with distinct efficiency for the evaluation of permeability have been proposed by engineers and petroleum geologists. The oil industry uses acoustic and Nuclear Magnetic Resonance (NMR) loggings extensively to determine permeability quantitatively. However, because the number of available NMR logs is not enough and there is a significant difficulty in their interpreting and evaluation, the use of acoustic logs to determine the permeability has become very important. Direct, continuous, and in-reservoir condition estimation of permeability is a unique feature of the Stoneley waves analysis as an acoustic technique. In this study, five intelligent mathematical methods, including Adaptive Network-Based Fuzzy Inference System (ANFIS), Least-Square Support Vector Machine (LSSVM), Radial Basis Function Neural Network (RBFNN), Multi-Layer Perceptron Neural Network (MLPNN), and Committee Machine Intelligent System (CMIS), have been performed for calculating permeability in terms of Stoneley and shear waves travel-time, effective porosity, bulk density and lithological data in one of the naturally-fractured and low-porosity carbonate reservoirs located in the Southwest of Iran. Intelligent models have been improved with three popular optimization algorithms, including Coupled Simulated Annealing (CSA), Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). Among the developed models, the CMIS is the most accurate intelligent model for permeability forecast as compared to the core permeability data with a determination coefficient (R2) of 0.87 and an average absolute deviation (AAD) of 3.7. Comparing the CMIS method with the NMR techniques (i.e., Timur-Coates and Schlumberger-Doll-Research (SDR)), the superiority of the Stoneley method is demonstrated. With this model, diverse types of fractures in carbonate formations can be easily identified. As a result, it can be claimed that the models presented in this study are of great value to petrophysicists and petroleum engineers working on reservoir simulation and well completion.

www.nature.com/scientificreports/ maximum recover factor from the studied reservoir; and determining the best reservoir drainage pattern and optimal drilling location. While having an absolute value is valuable for reservoir permeability, generating the well permeability profile is also of significant importance. However, achieving a well permeability profile is one of the most challenging engineering measurements. For this purpose, two direct and indirect methods have been invented in past decades. In the direct method, measurements are made at several points along the well, including well testing techniques, Repeat Formation Tester (RFT), and analysis of core samples taken from wells under different conditions. In the indirect method, permeability can be determined by interpreting various properties such as porosity, processing other logs such as Nuclear Magnetic Resonance (NMR) and geochemical logs, and using mathematical models with simplifying assumptions. Because these models are not accurate, their results are highly uncertain. The Stoneley acoustic wave analysis is the only direct and continuous measurement technique for permeability prediction along wellbores. Although the principles of Stoneley wave measurement have long been known, accurate and reliable measurement of the permeability by this method is still challenging 1 .
At low frequencies, the Stoneley waves become cylindrical, pushing borehole fluid piston-like into the formation. When the Stoneley waves reach the permeable or fractured zones, fluid displacement occurs between the wellbore and the formation. As a result, a decrease in energy level and damping of the wave are expectable, which lead to an increase in the wave slowness time. Permeable regions or fractures have a variety of properties and characteristics that affect Stoneley waves in different ways. In the case of permeable fractures, local and robust interferences create the chest pattern reflections in the Variable Density Log (VDL). Besides, unique methods have been developed by several authors to determine the fractures using the Stoneley wave analysis [2][3][4] .
In addition, it is very challenging to evaluate permeability in carbonate reservoirs because the factors influencing the permeability in carbonates are often different from those in sandstones. It is usually impossible to determine a good relationship between porosity and permeability in carbonate formations. Each method available in this field has advantages and disadvantages 5 . Numerous researchers have shown that the propagation of an acoustic wave in a well containing fluid is fundamentally different from the propagation of a flat wave in the vicinity of a single interference surface. This difference is due to the formation of different types of waves (i.e., surface and internal waves) in the well environment. An acoustic waveform contains valuable information. The most crucial wave components in a full waveform for measuring permeability are Stoneley and shear waves. The researchers found a relation between the Stoneley wave propagation in the well and the rock permeability 6,7 . They also showed a perfect match between high-frequency (~ 20 kHz) Stoneley wave energy and phase velocity, and attenuation of low-frequency (~ 1 kHz) Stoneley wave with core permeability 6 . Determining the permeability and how it is distributed is very important in many ways, including designing and completing wells for the successful implementation of Enhanced Oil Recovery (EOR) programs (e.g., water flooding) and simulating the reservoir model for optimizing its management 8 . Stoneley waves, which are classified in the group of guided waves, propagate at the interface between wellbore fluid and formation 9 . The effect of slowing down and dampening the Stoneley wave is a function of the frequency so that increasing the permeability increases the wave attenuation and slowness time. Over the past two decades, much progress has been made in acoustic well logging 1 . This advancement, including measurements of high-quality Stoneley waves and Signal-to-Noise Ratios (SNR), is performed using the Slowness/Time Coherence (STC) technique to process the Stoneley waves. This method is based on the development of the Semblance algorithm by Kimball and Marzetta 10 for acoustic wave processing.
Numerous researchers worldwide have carried out massive studies on the prediction of the permeability and mobility of geological formations using the Stoneley wave analysis method 7,[11][12][13][14][15][16][17][18][19][20][21][22][23][24] . Some of these studies are in the form of mathematical and analytical modeling, and some others utilized empirical/semi-empirical relationships to estimate the absolute rock permeability. Analytical models require a large number of inputs that may not be available and, or maybe estimated roughly. In addition, advanced mathematics is needed to solve these analytical models, which limits their applications. It should be noted that these complex models use several simplifying assumptions to make simpler their solutions, which leads to cumulative deviations from the real answer. Empirical/semi-empirical methods have also been of great importance in permeability calculations. The simplicity of such techniques is one of the main features of such techniques; however, empirical/semi-empirical methods do not have sufficient accuracy in estimating permeability and predicting its complex trend, especially in carbonate reservoirs with natural fractures. The models described above are usually developed for sandstone reservoirs or water-filled synthetic porous environments. To the best of the authors' knowledge, the researchers have rarely applied the aforementioned methods for naturally fractured carbonate formations around the globe. Therefore, there is a fundamental need for a new and universal method for calculating permeability in carbonate reservoirs with natural fractures. To answer this problem, powerful artificial intelligence methods can be used. Different studies in the literature have proven the successful application of these methods in various branches of science and engineering, especially geosciences and petroleum/chemical engineering [25][26][27][28][29][30][31][32][33][34] . Recently, several pieces of research have been implemented via diverse machine learning strategies to investigate the estimation of permeability via logging data 35,36 . These methods can be used to learn the behavior of scientific phenomena where there is not sufficient data to predict the desired parameters.
In the current investigation, a comprehensive modeling study was carried out by developing many artificial intelligence approaches, including Adaptive Network-Based Fuzzy Inference System (ANFIS), Least-Square Support Vector Machine (LSSVM), Radial Basis Function Neural Network (RBFNN), Multi-Layer Perceptron Neural Network (MLPNN), and Committee Machine Intelligent System (CMIS), for calculating permeability in terms of Stoneley and shear waves travel-time, effective porosity, bulk density and lithological data in one of the naturally-fractured and low-porosity carbonate reservoirs located in the Southwest of Iran. Three Evolutionary Algorithms (EAs), including Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Coupled Simulated Annealing (CSA), are combined with the prementioned predictive techniques to optimize the modeling. To the best of the authors' knowledge, it is the first time in literature that these kinds of modeling strategies are www.nature.com/scientificreports/ carried out for permeability estimation regarding the Stoneley wave analysis in naturally-fractured and lowporosity carbonate reservoirs. Several statistical analyses and graphical means are used to show the performance of the extended techniques via comparing to NMR permeability estimation methods. There is a bulk of literature studies focusing on permeability prediction via only conventional petrophysical logs such as Neuron (NPHI), Density (RHOB), and Sonic travel time (DT). Nonetheless, the application of advanced sonic tools such as Dipole Shear Sonic Imager (DSI), which prepares several curves (e.g., shear and Stoneley slowness), has not yet been comprehensively studied. In fact, joint use of fullset logs and advanced array sonic logs such as DSI (e.g., shear and Stoneley slowness) in association with a large number of core data is investigated in this study, which distinguishes this research from the available bulk of literature. To make this study comprehensive, several combinations of predictive and optimization approaches as hybrids are developed. Comparing the best model here (termed as CMIS model) with NMR-derived permeability models is another advantage and superiority of this study over other techniques. At last, it is beneficial to express that this study is investigated in low permeability and Naturally-Fractured Carbonate (NFR) oil reservoirs in Southwest Iran; thereby, it could strongly help to detect semi-filled fractures, micro-fractured, and vuggy media easily due to the anomalies occurring in the behavior of Stoneley wave slowness which propagates in the interface of borehole and formation. Vugs and fractures are the main reasons for large energy dissipations happening in the sound wave propagation. Stoneley waves show this phenomenon in the best manner for fracture detection.

Data gathering
Collecting a comprehensive database is a prerequisite for robust and accurate modeling [37][38][39][40][41] . For this goal, a complete dataset was first undertaken with an extensive range of variations from the petroleum industry in one of the naturally-fractured and low-porosity carbonate reservoirs. This data set includes Stoneley and shear waves travel-time, effective porosity, bulk density, and lithological data (as modeling inputs), and formation permeability (as modeling output). To prepare this database, the RCAL (Routine Core Analysis) data were depth matched with petrophysical data obtained from logging operations. Approximately 16% of the data bank was allocated for model training. The remaining data (about 84% of the total data bank) was used to check the model's ability to predict permeability in a carbonate formation. This allocation is carried out to include the range of parameters along with the entire reservoir zonation. Figure 1 indicates the changes in the input variables such as the effective porosity, bulk density, lithology, and the travel time of shear and Stoneley waves against the formation permeability.

Modeling strategies
EAs. PSO. Inspired by the social life of creatures such as birds and insects, a population-centric algorithm called PSO was created to optimize problems. Initially, random answers, known as particles, are generated 42 . Then, by updating the first generation of problem answers, the optimal answer is obtained 43 . Any possible answer by flying in the space of the problem seeks to achieve the optimal answer 44 . In the PSO method, a cluster of particles is called swarm. In a multidimensional space, particles of a sample population fly to a new location. The particles in this category are affected by the extent to which they are successful in locating their target neighborhood 45 . So, they work together in a neighborhood and society as a whole. Based on this external feature, a neighborhood with specific characteristics can be assigned to a particle. Neighborhoods can be divided into three categories: physical, social, and queen. Interested readers are advised to refer to Sharma and Onwubolu 45 for more information. Assume that the parameters g best, d , P best , id , v i (t), and x i (t), respectively, indicate the best global location, the best previous location obtained by the i-th particle, the velocity vector of i-th particle in the t-th trial and error, and the location of i-th particles in t-th trial and error. Therefore, the velocity vector at each step of the iteration is corrected as follows 44, 46 : In which, c 1 and c 2 represents the learning rate, w shows the weight of the inertia, and r 1 and r 2 stand for the random numbers between zero and one 47 .
According to Eq. (1), velocity vector consists of three main components: inertial, cognitive, and social modules 42,44 . To obtain the new particle location, the previous particle location must be added to its modified velocity vector according to the following equation 44 : where, v id , x id and t indicate the velocity vector of i-th particle, the location of i-th particles, and the number of trial and error. More information about the theoretical description of the PSO can be found in open literature [48][49][50][51][52] .
GA. GA is classified as an evolutionary optimization method that can solve complex phenomena without any formulation of the objective function. In this method, random answers are generated by a random selection of chromosomes. In GA, numerous operators, such as mutation, reproduction, and crossover, are used to produce the best response from preliminary chromosomes. Using the Mutation Factor (MF) and Crossover Factor (CF), a parameter called Offspring Production (OP) is created to describe the binary selection of chromosomes in terms of probability changes between zero and one [53][54][55] . The detailed description of the GA strategy is depicted in the open literature 51,52,[56][57][58][59][60] .
CSA. Modification of the Simulated Annealing (SA) algorithm leads to the development of a new method called Coupled SA (CSA), in which there is no significant reduction in its convergence rate. In order not to get www.nature.com/scientificreports/ into trouble of local optimal points, SA allows us to have a shift from initial to lower quality answers. During processing, the possibility of this displacement decreases. Therefore, the CSA algorithm was introduced to get rid of optima of local type and thus improve optimization. The most important difference between SA and CSA is the so-called parameter of the Acceptance Probability (AP). For the successful optimization of a divergent problem, Suykens and Vandewalle 61 proved that by coupling the local optimization processes, more optimal solutions to the problem could be found 62 .
Predictive techniques. ANFIS. For the first time, Jang 63 combined fuzzy logic and artificial neural networks to develop a model called ANFIS. In this algorithm, the disadvantages of the previous models, namely fuzzy logic and neural networks, are eliminated. The ANFIS algorithm is categorized as a rule-based adaptive method so that these rules are generalized during the model training process 64 . www.nature.com/scientificreports/ Using the IF-THEN fuzzy rule proposed by Takagi and Sugeno 65 , the relationship between inputs and outputs in ANFIS mathematical strategy is determined. The computational process in an ANFIS model with z 1 and z 2 as inputs of the model, and f as simulation output is as follows 64, 66-68 :

Rule 2:
In general, the ANFIS mathematical strategy uses five internal computational layers 63 , which are as follows 63, 64, 66-68 : 1. First layer: input nodes-Each input node with the arbitrary index i is an adaptive node that follows the following rule: In which, O i,1 and µ Ai , in turn, represent the output of i-th node, and membership function of the A i parameter. Typically, the following membership function, known as the Gaussian equation, is applied in calculations 64, 69 : 2. Second layer: rule nodes-Each node in the second layer is considered as a fixed node that multiplies all the input signals and derives the result according to Eq. (7) 63, 64, 66-68 : 3. Third layer: normalization nodes-At this stage, the weight factor is normalized by node i. To do this, the ratio of the i-weight node to the sum of the total weight nodes is calculated using the following formula 63, 64, 66-68 : where, the parameters w i and O 3,i , respectively, show the normalized weight coefficient and the output of the third layer. where, the parameters w i , f i and O 5,i , respectively, show the normalized weight coefficient, the simulation outputs and output of the fifth layer. In this study, the coefficients of the ANFIS model were tuned by using PSO optimization method. For more information, the interested readers are suggested to refer to the available literature 48,70 .
Artificial neural network (ANN). The ANN is one of the first generations of intelligent models for soft computing, which has numerous capabilities such as improving efficiency, adapting to environmental changes, and teaching/learning with an experience-oriented focus 71 . In this parallel distribution method, the simplest element is the neuron processor, which is connected to each other and organized in different layers. Two well-known types of ANN modeling are RBFNN and MLPNN. The MLPNN consists of three layers: input, hidden (intermediate), and output. There are a number of neurons in each layer. The number of neurons in the hidden layer is determined by an optimization method. The relationships governing the neural structure of MLPNN are related to the parameters of the parallel structure problems, and the model training process is performed as structural interconnections. To achieve the best MLPNN structure, inter-neural connections must be established using appropriate optimization algorithms 72 .
In RBFNN modeling, computational process design is simpler than MLPNN approach. For patterns where the MLPNN is not applicable, the RBFNN structure can provide very good feedback from their modeling 73 . Accordingly, RBFNN can be classified as a feed-forward neural network method that has been developed according to the iterative function approximation network and local base function. Due to the faster training process and the simpler structure that RBFNN has than MLPNN, the RBFNN is more popular and preferred in simulation 74 . The RBFNN can provide an ideal answer to any problem related to continuous functions with multiple inputs/ outputs in a limited range. The desired modeling accuracy will be achieved by minimizing the defined Objective www.nature.com/scientificreports/ Function (OF) because the optimal specifications of the regulator network include three linear coefficients. In the RBFNN structure, it uses Radial Basis Function (RBF) as an activation function in the middle layer for each node/neuron. Model parameters include the exact shape of RBF, distance, and scale center. In linear modeling, we are dealing with fixed parameters. Therefore, the best global answer can be obtained with weight coefficients compatible with the minimum error 75 . Schematic structure of the RBFNN modeling is shown in Fig. 2. More information about the theoretical description of the MLPNN and RBFNN can be found in the open literature [76][77][78] .

LSSVM.
A supervised learning method called support vector machine (SVM) was first developed by Vapnik 79 based on statistical learning theory. This numerical structure can be used for a number of issues related to classification and regression. In addition to the many benefits of SVM, its calculations include solving several quadratic equations. This drawback of SVM is solved by introducing a newer SVM version called the LSSVM method. Instead of quadratic programming, linear equations that lead to simplification of the optimization process, are used in the LSSVM strategy 80 . Assume (x i ,y i ) n be a dataset with size n, x i shows input variable and y i is output parameters. The LSSVM method can now be used to calculate any function. The linear regression function can be expressed as follows 81 : As ω represents the weight vector, b represents the constant coefficient and φ(x) refers to a nonlinear equation such as sigmoid, linear, radial basis, and polynomial function. To determine the optimal values of b and ω, the total value of the parameters indicating complexity and empirical risk must be minimized according to the following equation 82 : The loss function independent of ε is shown in the following equation: To optimize, the Lagrangian saddle point will be obtained using the problem presented in Eq. (12) and its limitations shown in Eq. (13) as follows: www.nature.com/scientificreports/ To create a set of optimization equations, a differentiation from the above equation is taken in terms of b, ξ i , ω and ξ * i . As a result, the following set of equations is established: One of the optimization conditions is the validation of the following relationship: The kernel function is written as follows: So, the SVM problem is designed by the following equation 83 : Complex SVM mathematics has reduced the popularity of this model. This is due to the limitations in optimizing the SVM algorithm. Therefore, by using linear programming, this shortcoming in LSSVM has been eliminated 84 . For more details about the theoretical descriptions of LSSVM, numerous researches are available the literature 50,51,56,78 .
CMIS. CMIS, which was first established by Nilsson 85 and then developed as a result of the work of Haykin and Network 86 , is an integration system from several other intelligent methods. Figure 3 shows the structure and functional mechanism of the developed CMIS algorithm in this study. Therefore, the CMIS output can be displayed as follows: where, y indicates the prediction of each expert system. The coefficients a, b, c, d and e are equal to 0.6969, 0.7086, 0.0763, 0.3606, and − 0.1495, correspondingly. These coefficients are optimized by the GA method.

Model development.
In front of dense formations, the Stoneley wave velocity is essentially influenced by the formations and wellbore fluid characteristics. In front of the porous intervals, the shear slowness increases and the Stoneley wave slowness is affected by the formation shear modulus and bulk fluid modulus. At low frequencies, the following equation can be written 23 :  87 . This equation is valid and correct in a formation with zero permeability that the Stoneley slowdown is only affected by the elastic properties of the well and formation.
In front of the permeable zone, the behavior of the Stoneley wave is altered by fluid displacement. As the pore fluid can move relative to the solid matrix, this interaction lowers the energy and velocity of the Stoneley wave. Due to the higher formation permeability or mud fluidity, more Stoneley energy is reduced, and as a result, the wave slows down. Equation (22) links Stoneley slowness to the bulk formation density and shear slowness in the impermeable formations 23 . The following formula can be written in front of the permeable zones: where, DTST exp . and DTST pred. represent, in turn, the measured and calculated Stoneley slowness. Having mud slowness and mud density, DTST pred. can be obtained from Eq. (34) theoretically. Equation (22) is low-frequency estimate of Stoneley wave slowness, which is obtained by simplifying more complex relationships. As a result, tuning the abovementioned parameters existing in Eq. (22) is more reliable instead of direct measurement of the mud property. Equation (22) can be re-arranged as follows: in which, DTST , DTS , DT f , ρ f and, ρ b , respectively, stand for the Stoneley slowness (in μs/ft), the shear slowness (in μs/ft), mud slowness (in μs/ft), the mud density (in g/cc), and bulk formation density (in g/cc). According to Eq. (24), sketching DTST 2 versus DTS 2 ρ b in impermeable (zero porosity or shale) zones leads to a line with slope and intercept values equal to ρ f and DT 2 f , respectively. Thereby, Eq. (24) is determined clearly to be used for calculating DTST . Then, the Stoneley permeability index can be calculated by the following equation 14 : In the above equation, the Stoneley permeability index is shown by KIST , DTST indicates Stoneley slowness, and superscripts exp. and pred. are used to exhibit experimental and predicted values. This index indicates the amount of tortuosity of pore flowing channels better than their volume. Therefore, this index shows the specifications of hydraulic unit in the porous media. Considering Poisseulle model and Darcy's law, the following relationship can be presented to describe the permeability of the hydraulic unit 14 : In Eq. (26), K, FZI and φ e indicate absolute permeability (in md), flow zone indicator and effective porosity (in fraction). A linear relationship can be identified for FZI by using Eq. (26). In the impermeable zones, FZI becomes zero when the Stoneley permeability index reaches one; however, both FZI and KIST values are infinite in the highly permeable zones. Therefore, the following simple relationship can be put forward between FZI and KIST 8 : In Eq. (27), IMF is known as the index matching factor. In this equation, IMF is the only tuning parameter for reaching a satisfactory match between the core permeability and the estimated permeability values by Eq. (26). Due to the fact that the grain modulus affects Stoneley wave slowness, the IMF could be calculated from the following equation 8 : www.nature.com/scientificreports/ The symbols IMF i and V i indicate the index matching factor and volumetric composition of each mineral constituting the rock, correspondingly.
Based on the above theory and bulk of literature studies 8, 16, 18, 20-22, 24, 88 for permeability calculation by Stoneley wave analysis, the following relationship can be proposed: where, RHOB , PHIE , DTST , DTS , DOLOM and CALCIT indicate bulk formation density, effective porosity, Stoneley wave slowness, shear wave slowness, and volumetric composition of dolomite and calcite in the formation lithology. Permeability is a tensor quantity, in which its determination is in need of anisotropy determination. For this goal, full waveforms are processed to determine anisotropy using borehole loggings. The joint study of shear wave elements, including both transverse and vertical components, and Stoneley wave velocity leads to the determination of waveform anisotropy. The main consequence of anisotropy calculation is the determination of permeability, pressure, and stress tensors. In this study, the inclusion of DTST and DTS leads to a more accurate determination of anisotropy in permeability calculation 89 . DTST considers the tube-wise wave propagation through the formation and borehole interface, although the DTS measures the shear-wise movement of the acoustic waves into the porous media. Therefore, the obtained permeability from Eq. (29) could be a proper representative for the rocks with pores as a bulk.
For optimizing the innovative mathematical strategies developed in this study, the well-known cost function, namely Root Mean Square Error (RMSE), is utilized as follows 37 : in which, K exp . , K pred. and N stand for experimental permeability, the predicted values by the smart mathematical strategies in this study, and the size of the dataset applied for modeling.

Results and discussion
As already mentioned, four powerful models were used, including Adaptive Network-Based Fuzzy Inference System, Least Square Support Vector Machine, Multi-Layer Perceptron Neural Network, and Radial Base Function Neural Network. These models were integrated with some of the most important optimization methods available in the literature, such as Genetic Algorithms, Particle Swarm Optimization, and Coupled Simulated Annealing leading to the creation of more robust hybrid connectionist methods. These models include PSO-ANFIS, CSA-LSSVM, GA-RBFNN, and MLP. Afterward, by combining these four mathematical strategies, a stronger model called Committee Machine Intelligent System was developed.
To reasonably evaluate the performance of the methods developed in this study, several investigations were performed, including graphical methods such as cross-plot, contour map, outliers technique, and sensitivity analysis, and parametric methods such as Root Mean Square Error (RMSE), Average Absolute Deviation (AAD) and R-square parameters. To perform a point-by-point analysis of permeability estimation, the permeability profile was sketched along different zones to compare the performance of the models established in this study with the core permeability data and their potential in determining micro-and semi-filled fractures. Finally, the profile of the best model in this study was evaluated via one of the methods derived from the Nuclear Magnetic Resonance (NMR) for estimating permeability.
The adjustable parameters of the utilized smart methods studied here are reported in Table 1. These parameters have been calculated using the prementioned optimization methods that have led to the production of the least error in modeling the permeability based on the Stoneley wave. Table 2 reports the error of intelligent models for various subsets. The definitions of the errors are depicted as follows: www.nature.com/scientificreports/ in which, N , K pred. and K exp . symbolize the number of datapoints used in the modeling process, the predicted, and actual rock permeability values, correspondingly. R 2 parameter shows the degree of fitness of the predicted/calculated permeabilities compared to measured/target data. This statistical parameter varies from zero to unity indicating the poorest fitness (unreasoned relationship between input and output data) and the best fitness (reasonable and logical relationship between input and output data), respectively. AAD exhibits the mean  www.nature.com/scientificreports/ value of the absolute error, in which the lower the AAD parameter results in the lower deviations from actual/ target data. A zero value is desirable for AAD parameter. In addition, RMSE is also used to assess the model's performance, which deals with the average of the square of the error value. The square root of the obtained value would be given as RMSE.
Based on Table 2, the total values of AAD and RMSE are, respectively, equal to 5.13 and 76.72 for CSA-LSSVM, 5.44 and 76.77 for MLPNN, 4.05 and 31.06 for GA-RBFNN, 5.18 and 76.71 for PSO-ANFIS, and 3.79 and 29.39 for CMIS. Based on this, it can be decided that the CMIS and GA-RBF models, in turn, provide the lowest prediction errors in computing the permeability of the carbonate reservoir rock. Also, the error parameters in all smart and deep learning models in both test and learning phases approximately have the same orders of magnitude, which confirms that the over-training problem has not occurred in this study. It is crystal clear that the MLPNN gives the lowest prediction accuracy with the highest AAD and RMSE values, and the smallest amount of R 2 . In other words, the following order is established amongst the accuracy of the proposed models: For comparing the measured core permeabilities with the estimated values by different machine learning techniques suggested here, the so-called parity diagrams are illustrated in Fig. 4. According to this figure, the closer the data cloud is to the "Y = X" line, the more precise the model is to estimate the target parameter. This can be illustrated by the R 2 , which is clearly shown in Fig. 4. The closer the R 2 value to unity leads to the greater accuracy of the modeling. As can be seen, for CMIS and GA-RBFNN models, correspondingly, the highest proximity of the data cloud to line "Y = X" is observed. The values of the R 2 for CMIS and GA-RBFNN models are 0.8726 and 0.8707, respectively.  www.nature.com/scientificreports/ The results of the calculated permeations for the proposed smart methods are shown in Fig. 5 through Asmari formation zone 1 to 5 and top of Pabdeh formation. In this carbonate reservoir, the lithology is mainly made of lime and dolomite. As obviously shown, the core porosity data have a good match with porosity log obtained from petrophysical interpretation. Also, the porosity mainly varies between 5 and 10%, which confirms that the studied carbonate formation is an unconventional reservoir with fairly low porosity. As you can see, the results of the CMIS and GA-RBFNN are, in turn, much better compatible with core permeability data than other models. These two models, in addition to correct prediction of permeability trend, can also be used to detect vuggy, semi-filled and micro-fractures. These fracture indications are identified through the whole wellbore profile in Fig. 5. The available peaks in core permeability log at depths of 2960, 3020, 3107, 3150, 3205, 3227, 3247 and 3268 m give evidence to the claim that the GA-RBFNN and CMIS models have strong potential to successfully detect permeability changes in micro-fractures and semi-filled ones in the low-porosity carbonate formations.
Another available literature method for permeability prediction is derived from NMR logging. Two main techniques for permeability prediction have been developed namely Timur-Coates and Schlumberger-Doll-Research (SDR). Relaxation time method (T2) or SDR has been introduced by Kenyon et al. 90 with the following formulation 23 : where, K SDR represents SDR permeability (in md), φ NMR is the overall porosity of the NMR log (in percentage), and T 2,log shows the average logarithmic representation of the T2 distribution time (in ms). In Eq. (34), the

(a)
Indication of semi-filled, vuggy and micro-fractures  Figure 6 indicates the NMR permeability profiles attained by SDR (i.e., Eq. (34)) and Timur-Coates (i.e., Eq. (35)) throughout Amari formation zone 1 to 5 and top of Pabdeh formation. First of all, the distribution of T2 or relaxation time log shows a good relative correlation with the porosity logs obtained from both core data and petrophysical evaluation. The permeability results from SDR and Timur-Coates models are fairly matched to the core permeability data. In Pabdeh formation, the SDR permeability profile is highly disturbed by borehole washout; therefore, only   (35)) is used for the rest of analysis. For checking the applicability of the best smart method suggested in this study, the CMIS approach is compared with Timur-Coates model (i.e., Eq. (35)), as shown in Fig. 7. Alongside the entire wellbore, the CMIS permeability profile shows highly better agreement with the core permeability log. NMR method (i.e., Eq. (35)) poorly matches with measured permeability data and large deviations can be observed from the actual data; however, CMIS method accurately follows the main permeability variations in all zones. Furthermore, in very high permeable intervals indicating semi-filled, micro-and vuggy fractures, CMIS predicted permeability well matches the actual data, even though NMR log predominantly ignores those fracture indicating layers. Several pieces of evidence for these fracture indications are shown in Fig. 7. The main reason for this great consistency of the CMIS model predictions with the real data is the presence of Stoneley slowness log as one of the inputs to the CMIS model. The Stoneley wave propagates cylindrically in the interface of the wellbore and the formation, which is why it is also called a tube wave. In permeable and cracked zones, the pore spaces in the rock increase and cause the Stoneley wave to be trapped in the caves and fractures. Therefore, the energy of the Stoneley wave decreases, which leads to an increase in the travel time or slowness of the Stoneley wave. Figure 8 indicates the outcomes of the sensitivity analysis via Pearson's technique 92 . As can be seen in this figure, to determine the exact amount of effect of each variable on the output prediction, a normalized value between -1 and + 1 is calculated as the impact value of that variable, which is obtained by the following formula 92 : Indication of semi-filled, vuggy and micro-fractures  www.nature.com/scientificreports/ In the last analysis, a well-known plot named William's diagrams is prepared as illustrated in Fig. 11. The degree of uncertainty in any database reduces the reliability of modeling conducted on that dataset. There are some intolerable errors in the system due to human and equipment errors during the log measurements and interpretations (i.e., lithology, effective porosity, bulk density and slowness of Stoneley and shear waves as input), and core measurements (i.e., permeability as output) 93,94 . This part of the data, which behaves abnormally, is called outliers data. A method called Leverage method or William's plot is used to detect these outliers' data. In this graph, the standardized residuals (vertical axis) are plotted against the Hat value (horizontal axis). The following formula is used to calculate the Hat matrix 93, 94 : where, Hat matrix, the input matrix with m rows (i.e., database size) and n columns (i.e., number of input data), and matrix transpose operation are shown with symbols H, X and superscript T correspondingly. The Hat values are recognized as the main diagonal of Hat matrix. Hat values have a critical restriction, which is defined as below 93, 94 : In above equation, H * shows the critical Hat value. In this study, H * is equal to 0.0082. In the shown outliers' plot (see Fig. 11), a special region, namely applicability domain, is shown in which standardized residual varies between − 3.0 and + 3.0 and Hat value is less than 0.0082. Based on Fig. 11, about 3.7% of the database (94 data) are identified as suspected or outliers' data. This means that the applied database here is valid and the CMIS model is truthful.

Source of error.
In this study, many factors affect the computation error. The database is derived from two different sources, including RCAL and wireline logging. These two measurements are performed separately at different times. Therefore, it is necessary to depth matched the data from RCAL with logging data, which is usually integrated with an error. The logging operation in oil and gas wells can also be affected by some environmental factors such as the mud type and its additives, the borehole diameter, the degree of washout, and tool quality and its calibration. RCAL measurements are influenced by adverse factors such as differences in well and laboratory conditions, sampling method, and human and the device errors. Therefore, there is an initial uncertainty that is noticeable in the database, which means that the error of the created model will never be less than a certain value; hence, the consistency between the predicted and the actual values is almost always not ideal. Clearly, it is difficult to implement a modeling study in the low-porosity carbonate formations, which are full of complex geological features.

Conclusions
In this study, a number of hybrid machine learning methods, including Adoptive Network-Based Fuzzy Inference System (ANFIS) optimized with Particle Swarm Optimization (PSO), Least-Square Support Vector Machine (LSSVM) optimized with Coupled Simulated Annealing (CSA), Radial Basis Function Neural Network (RBFNN) optimized with Genetic Algorithm (GA), Multi-Layer Perceptron Neural Network (MLPNN), and Committee Machine Intelligent System (CMIS), were used to prediction formation permeability based on the guided theory of Stoneley wave propagation. For this purpose, a database including lithological data, bulk formation density, shear and Stoneley slowness, and effective porosity (as the input) and core permeability (as the output) were used in a low-porosity carbonate formation. Approximately, 16% of the database was used to train powerful algorithms, then the rest of the database was used to test and evaluate the models alongside the studied www.nature.com/scientificreports/ low-porosity carbonate formation. A number of statistical error parameters such as RMSE, AAD and R 2 , and diverse graphical analyses such as sensitivity analysis, cross plot and outliers' data analysis were used to evaluate the models as accurately as possible. It is found that the slowness of the Stoneley wave has the greatest effect on the permeability prediction, even though the effective porosity has the least effect. Moreover, the CMIS model with R 2 = 0.87 and RMSE = 29.39 offers the most accurate profile for permeability. Comparing the CMIS model with the permeability profile obtained from Timur-Coates model (i.e., NMR method) reveals that the intelligent model presented in this study has more accuracy and higher agreement with core data. Furthermore, features like vugs, micro-, and semi-filled fractures, which are mentioned in the Routine Core Analysis (RCAL) report, are easily detected by the CMIS model, while the Timur-Coates model has high weakness in diagnosing microfractures. According to the outliers' analysis, it is proven that less than 4% of databank are outliers illustrating the database validity and the model reliability. Finally, it should be noted that the in-depth intelligent learning